Description

BACKGROUND 

The invention relates to ordering groups of text in an image.

Documents can be scanned and stored as images in a computer. Text recognition techniques, such as
optical character recognition (OCR), can then be used to convert text in these images to a computer-editable format,
such as text characters. Scanned images can contain text organized in multiple, distinct blocks (e.g., multiple columns,
captions, footnotes, footers). The text blocks may further be separated by relatively large areas of
blank space or graphical objects (lines, pictures, and so forth). Text can also be surrounded by a frame or contain
insets, which further separate the text into blocks. Although a person reading the page may be able to recognize the
proper order of the text blocks in the image, it may be difficult for an OCR program to identify the text (by discarding
the non-text components such as blank spaces and graphical objects) and then group the text into the proper reading
order.

SUMMARY 

In general, in one aspect, the invention features a computer-implemented method of ordering text in an image
stored in a computer. Text is grouped in multiple regions. The text regions are represented as a graph having vertices
and edges. An optimal Hamiltonian path through the vertices is calculated, and the text regions are ordered according
to the calculated optimal Hamiltonian path.

Implementations of the invention include one or more of the following features. The representing step includes
defining each group of text as a vertex in a graph, defining edges between the vertices, and assigning weights to the
edges. Directed pairs of edges are defined between any two vertices. An optimal Hamiltonian path through the vertices
is calculated according to the assigned edge weights by solving for a traveling salesman problem. The weights
are assigned to edges between vertices based on the distance between any two text regions, the text characteristics
of the corresponding text regions, and the existence of non-text separators between text region pairs.

In general, in another aspect, the invention features a program residing on a computer-readable medium for
ordering text in an image stored in a computer. The program includes instructions for causing the computer to group the
text in multiple regions and to represent the text regions as a graph having vertices and edges. The text regions are
ordered according to a calculated optimal Hamiltonian path through the vertices.

In general, in another aspect, the invention features an apparatus for recognizing text in an image that includes a
storage medium to store the image and a processor operatively coupled to the storage medium. The processor is
configured to group the text in multiple regions and to represent the text regions as a graph having vertices and edges.
Further, the text regions are ordered according to a calculated optimal Hamiltonian path through the vertices.

The invention has one or more of the following advantages. The proper order of multiple, distinct blocks of text in
a captured image can be determined reliably by a text capture program.

Other features and advantages of the invention will become apparent from the following description and from the
claims.

BRIEF DESCRIPTION OF THE DRAWINGS

Fig. 1 is a flow diagram of a text capturing and ordering program in accordance with the present invention.
Fig. 2a is a diagram of text in an image separated into blocks.
Fig. 2b is a diagram of vertices representing the text blocks of Fig. 2a.
Fig. 3 is a diagram of a graph containing the vertices representing the text blocks of Fig. 2a, along with oriented
pairs of edges between adjacent vertices.
Fig. 4 is a diagram of an optimal Hamiltonian path through the vertices in the graph of Fig. 3.
Fig. 5 is a diagram of a text block. 

Fig. 6 is a flow diagram of a process for separating a page into independent parts. 
Figs. 7, 8, and 9 are diagrams of text blocks in page parts. 
Fig. 10 is a block diagram of a computer system. 

DETAILED DESCRIPTION 

Referring to Fig. 1, a computer-implemented text capturing program is described that can reliably identify the proper
order of text organized into multiple, distinct blocks in an image. The program first captures and stores an image (step
102). Next, the program identifies the text blocks in the page based on conventional page layout analyses (step 104).
For example, the image can be represented as density histograms, with very dense regions indicating non-text objects,
such as graphical objects, and very sparse regions indicating gaps. Alternatively, the identification of text blocks can
also be based on such factors as the proximity of the text blocks to each other, font size, and the existence of space
separators and blocks of graphical objects. Thus, for example, although text characters in a page may be horizontally
aligned, they may be separated by a wide gap, indicating that the characters are located in two different columns. In
addition, the section heading for the page of text may have a different, larger font than the remaining text. The text
characters may also be separated by graphical objects interspersed throughout the page.

After the text blocks have been identified in the image, the program separates the text blocks into independent
subsets or parts of the page, if possible (step 106). Many pages can be divided into smaller parts that are divided by
certain types of separators. These independent parts can be processed separately by the program, thereby reducing
the complexity of finding the order of the text blocks in a page. Steps 108-116 in Fig. 1 are performed separately for
each identified independent part of the page.

To further reduce the complexity of finding the order of the text blocks in each part of the page, the program next
combines text blocks where possible (step 108). Often there is only one way to order two or more text blocks. In such
cases, the blocks can be combined into a new single text region.

In the exemplary part 200 of a page in Fig. 2a, the darkened boxes 202 and 204 correspond to non-text objects,
such as graphical objects. Further, a vertical divider line 206 separates text. In this image, the identified text blocks
are labeled as text blocks 1-8. In each page part, the program then designates each text region (a region can be one
text block or a group of combined text blocks) as a vertex of a graph (step 110). In Fig. 2a, text blocks 1 and 2 can be
combined into one text region 12 and text blocks 6 and 7 can be combined into one text region 67. Thus, in Fig. 2b,
vertices V12, V3, V4, V5, V67, and V8 are designated for the text regions in Fig. 2a. The positions of the vertices are not
necessarily geometrically related to the locations of the text blocks 1-8 in the image 200.

Next, the program defines directed edges (Vi, Vj) and (Vj, Vi) for each pair of vertices Vi and Vj (step 112). A pair
of directed or oriented edges is defined between any two vertices because of the possibility that, as between any two
text regions, either text region may come before the other. The vertices V12, V3, V4, V5, V67, and V8 along
with the directed edges between each of the vertices define a directed or oriented graph G, as shown in Fig. 3.

The relationships between the vertices V are then defined (step 114) by assigning edge lengths (or weights) to the
directed edges (Vi, Vj) and (Vj, Vi), based on a number of factors. These factors include the distance between any two
text blocks, the characteristics (e.g., number of lines, font size, spacing) of two text blocks, and the existence of
separators (such as empty space or other non-text objects) between the text block pairs. The edge lengths are based on
the likelihood that one vertex Vi comes before its adjacent vertex Vj. The higher the likelihood that text region i comes
before text region j, the smaller the weight of edge (Vi, Vj), and vice versa.

Thus, for example, in Fig. 3, the weight assigned to the edge (V12, V3) is much smaller than the weight assigned
to the edge (V3, V12) because it is much more likely that text region 12 comes before text region 3.

Next, using the weights determined for the edges of the graph, the program finds an optimal Hamiltonian path
through the vertices V12, V3, V4, V5, V67, V8 by using brute force (for small graphs) or conventional heuristic or
approximate methods that solve a traveling salesman problem (step 116). An identified optimal Hamiltonian path is shown in
Fig. 4, with the path starting at V12 and continuing to vertices V3, V4, V5, V67, and V8 successively. Next, the program
combines the partial orders found for the corresponding parts of the page into a final order π (step 118).

The following mathematical model is defined to perform the text ordering process. Referring to Fig. 5, for a text
region A with coordinates (T,B,L,R) in a two-dimensional X-Y space, let

$$\mathrm{Top}(A) = T,\quad \mathrm{Bot}(A) = B,\quad \mathrm{Lft}(A) = L,\quad \mathrm{Rgt}(A) = R, \qquad \text{(Eq. 1)}$$

and

$$\mathrm{CntrX}(A) = (L + R)/2,\quad \mathrm{CntrY}(A) = (T + B)/2, \qquad \text{(Eq. 2)}$$

where L and R are on the X axis and T and B are on the Y axis. The distance between any two text regions A1 and
A2 is defined as

$$|A1,A2| = |\mathrm{CntrX}(A1) - \mathrm{CntrX}(A2)| + |\mathrm{CntrY}(A1) - \mathrm{CntrY}(A2)|. \qquad \text{(Eq. 3)}$$
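The region model and distance above can be sketched in Python as follows. This is a minimal illustration only: the class name `Region`, its field names, and the function name `distance` are assumptions introduced here and do not appear in the description. The Y axis is assumed to grow downward, which is consistent with the inequalities used later for alignment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """A text region with coordinates (T, B, L, R); names are illustrative."""
    top: float     # T, on the Y axis (assumed to grow downward)
    bottom: float  # B, on the Y axis
    left: float    # L, on the X axis
    right: float   # R, on the X axis

    @property
    def cntr_x(self) -> float:
        # Horizontal center: midpoint of the left and right edges.
        return (self.left + self.right) / 2

    @property
    def cntr_y(self) -> float:
        # Vertical center: midpoint of the top and bottom edges.
        return (self.top + self.bottom) / 2


def distance(a1: Region, a2: Region) -> float:
    """|A1,A2|: sum of the absolute offsets between the two centers."""
    return abs(a1.cntr_x - a2.cntr_x) + abs(a1.cntr_y - a2.cntr_y)
```

For two regions of equal height placed side by side, the distance reduces to the horizontal offset between their centers.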



For all pairs of text regions Ai, Aj, a precedence function f(Ai,Aj) is constructed so that, the more likely Ai precedes
Aj, the smaller the value of f(Ai,Aj). For K text regions, the precedence functions f(Ai,Aj), i=1-K, j=1-K, i≠j, are
calculated, which are used to calculate the edge lengths or weights between vertices.

Before the precedence functions are constructed, the complexity of the problem is reduced (1)
by separating the page into different parts; and (2) by combining text blocks into regions, where possible.

Referring to Fig. 6, the process of splitting the page into multiple parts is described. A page can be split into
independent parts by applying the following recursive algorithm. The program creates a set SP of page parts Pi and
initially places the whole page as an element of the set SP. For each element in the set SP, the program
looks for a splitting separator going through the existing element (step 302), where a splitting separator may be defined
as any separator region except a vertical line, which might be a column separator. If no splitting separator is found
(step 304), the process is stopped. If a splitting separator is found, the current element is divided into two new
sub-elements by splitting it along the selected separator (step 306). The current element is replaced by the two new
sub-elements and steps 302-306 are repeated. The sub-elements form the parts Pi of the page; thus SP = {P1, P2, ..., Pn}.
The orders found for the individual parts are then used to determine the final order.

To reduce the complexity in each page part, two or more text blocks or regions can be combined (step 108) if they
are horizontally or vertically connected. Two text regions A1, A2 are called horizontally connected (see
Fig. 7) if the following conditions are true:

(1) A1 and A2 are horizontally aligned, that is,

max(Top(A1),Top(A2))<min(CntrY(A1),CntrY(A2)),
and
min(Bot(A1),Bot(A2))>max(CntrY(A1),CntrY(A2));

(2) no other region overlaps a common bounding box of A1 and A2;

(3) A1 and A2 are blocked at the top, which means there are no regions above A1 and A2 or the nearest region
A3 above A1 and A2 is a blocking region, that is,

Lft(A3)<min(Lft(A1),Lft(A2)), and
Rgt(A3)>max(Rgt(A1),Rgt(A2)); and

(4) A1 and A2 are blocked at the bottom; that is, there are no regions below A1 and A2 or the nearest region
below A1 and A2 is a blocking region.

If the regions A1, A2 are horizontally connected, their partial order is from the left to the right (from A1 to A2 in Fig. 7).
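Conditions (1) and (3) for horizontal connection can be expressed as two small predicates. The sketch below is illustrative only: the names `Region`, `horizontally_aligned`, and `blocks_from_above` are assumptions introduced here, the Y axis is assumed to grow downward, and conditions (2) and (4) are left out because they require a search over all other regions of the page part.

```python
from collections import namedtuple

# Illustrative region type: top/bottom lie on the Y axis, left/right on the X axis.
Region = namedtuple("Region", "top bottom left right")


def cntr_y(a):
    # Vertical center of a region.
    return (a.top + a.bottom) / 2


def horizontally_aligned(a1, a2):
    """Condition (1): both tops lie above both centers, both bottoms below them."""
    return (max(a1.top, a2.top) < min(cntr_y(a1), cntr_y(a2))
            and min(a1.bottom, a2.bottom) > max(cntr_y(a1), cntr_y(a2)))


def blocks_from_above(a3, a1, a2):
    """Condition (3): region A3 extends past both A1 and A2 on the left and right."""
    return (a3.left < min(a1.left, a2.left)
            and a3.right > max(a1.right, a2.right))
```

Two columns that start and end at nearly the same height pass the alignment test, while a region located well below another does not.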
Two text regions A1, A2 are vertically connected (see Fig. 8) if the following conditions are true:

(1) A1 and A2 are vertically aligned; that is,

max(Lft(A1),Lft(A2))<min(CntrX(A1),CntrX(A2)),
and
min(Rgt(A1),Rgt(A2))>max(CntrX(A1),CntrX(A2));

(2) no other region overlaps their common bounding box;

(3) A1 and A2 are blocked at the left; that is, there are no regions to the left of A1 and A2 or the nearest region A3
at the left is a blocking region, that is,

Top(A3)<min(Top(A1),Top(A2)), and
Bot(A3)>max(Bot(A1),Bot(A2)); and

(4) A1 and A2 are blocked at the right; that is, there are no regions to the right of A1 and A2 or the nearest region
A3 at the right is a blocking region.

If the regions A1, A2 are vertically connected, their partial order is from the top to the bottom (from A1 to A2 in Fig. 8).

Connected regions A1 and A2 are combined into a single new
region A12. The bounding box of the new region is the smallest rectangle covering both A1 and A2 so that



EP 0 881 591 A1 



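$$\begin{aligned}
\mathrm{Top}(A12) &= \min(\mathrm{Top}(A1), \mathrm{Top}(A2)),\\
\mathrm{Lft}(A12) &= \min(\mathrm{Lft}(A1), \mathrm{Lft}(A2)),\\
\mathrm{Rgt}(A12) &= \max(\mathrm{Rgt}(A1), \mathrm{Rgt}(A2)), \text{ and}\\
\mathrm{Bot}(A12) &= \max(\mathrm{Bot}(A1), \mathrm{Bot}(A2)).
\end{aligned} \qquad \text{(Eq. 4)}$$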
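The bounding box of a combined region can be computed directly from the four edge coordinates. The following Python sketch assumes a simple tuple-based region type; the names `Region` and `combine` are illustrative and not taken from the description.

```python
from collections import namedtuple

# Illustrative region type: top/bottom lie on the Y axis, left/right on the X axis.
Region = namedtuple("Region", "top bottom left right")


def combine(a1, a2):
    """Return the smallest rectangle covering both A1 and A2."""
    return Region(
        top=min(a1.top, a2.top),
        bottom=max(a1.bottom, a2.bottom),
        left=min(a1.left, a2.left),
        right=max(a1.right, a2.right),
    )
```

Because the result is again a `Region`, the function can be applied repeatedly as further connected regions are found.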

Other parameters (such as font size and spacing) for the combined region could be transferred from the bigger of
regions A1, A2.

The combining process can be repeated until no more connected regions are found.

In some cases, the order of the text regions in a page part can be identified just by consecutively combining
connected text regions. For example, in the page layout shown in Fig. 9, the solution could be found by combining the
text regions as follows:

combine A5 and A6 into A56;
combine A56 and A7 into A567;
combine A2 and A567 into A2567;
combine A2567 and A8 into A25678;
combine A1 and A25678 into A125678;
combine A125678 and A3 into A1256783;
combine A1256783 and A4 into A12567834.

The resultant order of the uncombined text blocks is then A1, A2, A5, A6, A7, A8, A3, and A4.

After all connected text regions are combined, if more than one text region remains in a page part, the order of
the regions is determined by solving for the optimal Hamiltonian path of a graph G containing vertices V representing
the uncombined text regions.

For K text regions, this is accomplished by first constructing precedence functions f(Ai,Aj), i≠j, i = 1-K, j = 1-K, for
all the text regions. The precedence functions are used to assign lengths or weights to edges Eij between vertices Vi
and Vj.

The precedence function is defined as

$$f(Ai,Aj) = K_{loc}(Ai,Aj) + K_{dif}(Ai,Aj) + K_{sep}(Ai,Aj), \qquad \text{(Eq. 5)}$$

where K_loc evaluates the relative locations of the two text regions Ai and Aj; K_dif evaluates the similarity (in number of
lines, font size, and spacing) of text regions; and K_sep reflects the contribution to the function f due to the existence of
a non-text separating region, if any, between Ai and Aj. How K_loc, K_dif, and K_sep are derived is described below.

A graph G associated with a page part is defined as follows: G is a directed graph with K vertices V1, V2, ..., VK;
each pair of vertices Vi, Vj, i≠j, is connected by a directed edge Eij; a non-negative number W(Eij) (referred to as the
weight or length of the edge Eij) is assigned to each edge Eij:

$$W(E_{ij}) = f(Ai,Aj), \qquad \text{(Eq. 6)}$$

where f is the precedence function defined by Eq. 5.

For a given order π (which is a permutation of numbers 1, 2, ..., K), a Hamiltonian path P(π) in the graph G is an
ordered set of vertices

$$P(\pi) = \{V_{\pi(1)}, V_{\pi(2)}, \ldots, V_{\pi(K)}\}. \qquad \text{(Eq. 7)}$$

The length of the path P(π) is defined as

$$L(\pi) = W(E_{\pi(1)\pi(2)}) + W(E_{\pi(2)\pi(3)}) + \ldots + W(E_{\pi(K-1)\pi(K)}). \qquad \text{(Eq. 8)}$$

The shortest Hamiltonian path is a Hamiltonian path with the minimal value L(π).

Each Hamiltonian path P(π) in the associated graph G defines an order of text regions in the corresponding page
part. By the definition of the precedence function f, the shorter the Hamiltonian path P(π), the greater
the likelihood that π is the proper logical order of text regions. Therefore, the shortest Hamiltonian path P(π) in the
graph G provides the solution for finding the order π of text blocks in a page part.
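For small graphs, the shortest Hamiltonian path can be found by brute force, as the description mentions for step 116. The Python sketch below is one minimal way to do this, assuming the edge weights are stored in a square list of lists `w` where `w[i][j]` holds the weight of the edge from vertex i to vertex j (0-based indices). The function names are illustrative. The search enumerates all K! permutations, so it is only practical for a handful of regions.

```python
from itertools import permutations


def path_length(w, order):
    """Sum of the weights of consecutive edges along the given vertex order."""
    return sum(w[a][b] for a, b in zip(order, order[1:]))


def shortest_hamiltonian_path(w):
    """Try every permutation of the vertices and keep the shortest path."""
    vertices = range(len(w))
    return min(permutations(vertices), key=lambda order: path_length(w, order))
```

With the weight matrix `[[0, 1, 9], [8, 0, 2], [9, 9, 0]]`, the order (0, 1, 2) has length 1 + 2 = 3, which is smaller than the length of any other permutation.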

To find the shortest Hamiltonian path, the standard method of reducing it to the traveling salesman problem can
be used. An additional vertex V0 is added to the graph G, with the vertex V0 connected to each vertex Vi by edges
E0i and Ei0. The length of each of the edges E0i, Ei0 is 0, i.e., W(E0i) = W(Ei0) = 0. Next, a shortest ordered cycle C in the
extended graph is found by applying a standard algorithm for solving the traveling salesman problem. The shortest
Hamiltonian path is then extracted from the cycle C by removing the additional vertex V0 from the cycle.
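The reduction can be illustrated with a short Python sketch. It assumes a weight matrix `w` given as a list of lists with 0-based indices, and it substitutes an exhaustive search for the "standard algorithm" for the traveling salesman problem, which the description does not name. The function name `shortest_path_via_tsp` is hypothetical.

```python
from itertools import permutations


def shortest_path_via_tsp(w):
    """Find a shortest Hamiltonian path by solving a TSP on an extended graph."""
    k = len(w)
    # Extended graph: index 0 is the added vertex V0, original vertex i becomes i + 1.
    # Every edge into or out of V0 has weight 0.
    ext = [[0.0] * (k + 1)] + [[0.0] + list(row) for row in w]

    def cycle_length(cycle):
        # Length of the closed tour, including the edge back to the start.
        return sum(ext[a][b] for a, b in zip(cycle, cycle[1:] + cycle[:1]))

    # Exhaustive TSP (stand-in for any standard solver): fix V0 as the start.
    tours = ((0,) + p for p in permutations(range(1, k + 1)))
    best = min(tours, key=cycle_length)
    # Removing V0 from the cycle leaves the Hamiltonian path in original indices.
    return [v - 1 for v in best[1:]]
```

Since both edges touching V0 cost nothing, the length of each tour equals the length of the path that remains once V0 is removed, so the shortest tour yields the shortest path.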

After the orders πj of text regions in the independent parts P1, ..., Pn in the page have been identified, the paths
πj, j = 1-n, are concatenated into a final order

$$\pi = \{\pi_1, \pi_2, \ldots, \pi_n\}. \qquad \text{(Eq. 9)}$$

Because the parts Pj are independent, it does not matter how the orders πj are concatenated. However, an alternative
is to sort the parts P in increasing order of y and then x, where (x,y) is the top, left corner of each page part.

In the final concatenated order, combined text blocks in the text regions are separated out and placed in proper order in
a final order π*. For example, if π is {12, 3, 7, 56, 4}, it is modified to π* = {1, 2, 3, 7, 5, 6, 4} to define the order of
text blocks.

The final order π* provides the solution for the problem of identifying the order of text blocks in a page.
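The concatenation and expansion steps can be sketched as follows. The data layout is an assumption made for illustration: each page part is a tuple `(x, y, region_order)`, where `(x, y)` is its top, left corner and `region_order` is the ordered list of regions found for that part, and each region is a tuple of the original block numbers it combines. The function name `final_order` is hypothetical.

```python
def final_order(parts):
    """Concatenate per-part orders and expand combined regions into blocks."""
    # Sort the parts by increasing y and then x of their top, left corner.
    ordered_parts = sorted(parts, key=lambda part: (part[1], part[0]))
    blocks = []
    for _x, _y, region_order in ordered_parts:
        for region in region_order:
            # A combined region such as (1, 2) contributes its blocks in order.
            blocks.extend(region)
    return blocks
```

For a single part whose region order is 12, 3, 7, 56, 4, the call `final_order([(0, 0, [(1, 2), (3,), (7,), (5, 6), (4,)])])` yields `[1, 2, 3, 7, 5, 6, 4]`.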

As described above, the precedence functions f(Ai,Aj), i = 1-K, j = 1-K, are calculated based on values K_loc(Ai,Aj),
K_dif(Ai,Aj), and K_sep(Ai,Aj) for K text regions.

A row preference or column preference can be selected. If row preference is selected, then text region ordering
favors ordering in the X direction. If column preference is selected, the text region ordering favors ordering in the Y
direction. For regions A1=(T1,B1,L1,R1) and A2=(T2,B2,L2,R2), the component K_loc has a value that is
dependent on the relative locations of A1 and A2, and K_loc(A1,A2) is calculated differently than K_loc(A2,A1).
K_loc(A1,A2) is used to calculate f(A1,A2), while K_loc(A2,A1) is used to calculate f(A2,A1). Generally, because one text
region is more likely to precede the other, f(A1,A2) is usually not equal to f(A2,A1) due to the differences in
K_loc. The calculation of K_loc(A1,A2) or K_loc(A2,A1) is set forth below for three possible cases.

Because separate regions do not overlap each other, the case where min(R1,R2) > max(L1,L2) and
min(B1,B2) > max(T1,T2) is not possible and thus not considered.

In a first case, the boundaries of the text regions A1 and A2 do not overlap in the X or Y axis and A1 is to the left of
and below A2; that is, R1 < L2 and T1 > B2.
In this case, if column preference is selected, the value K_loc is defined as:

$$K_{loc}(A1,A2) = Q1 \cdot |\mathrm{CntrX}(A1) - \mathrm{CntrX}(A2)|, \qquad \text{(Eq. 10)}$$

where Q1 is a tunable parameter with a default value of 1; and

$$K_{loc}(A2,A1) = Q2 \cdot |A1,A2|, \qquad \text{(Eq. 11)}$$

where Q2 is a tunable parameter with default value 2.

K_loc(A1,A2) has a smaller value than K_loc(A2,A1), which tends to favor A1 over A2 if column preference is selected
in the first case.

In the first case, if row preference is selected, the value K_loc is defined as:

$$K_{loc}(A1,A2) = Q3 \cdot |A1,A2|, \qquad \text{(Eq. 12)}$$

where Q3 is a tunable parameter with default value 4, and

$$K_{loc}(A2,A1) = Q4 \cdot |A1,A2|, \qquad \text{(Eq. 13)}$$

where Q4 is a tunable parameter with default value 1. K_loc(A1,A2) has a larger value than K_loc(A2,A1), which tends to
favor A2 over A1 if row preference is selected in the first case.

In a second case, the boundaries of the text regions A1 and A2 do not overlap in the X axis but overlap in the Y
axis and the region A1 is to the left of region A2; that is, R1 < L2 and T1 < B2.
In this case, if column preference is selected, the value K_loc is defined as:

$$K_{loc}(A1,A2) = Q1 \cdot |\mathrm{CntrX}(A1) - \mathrm{CntrX}(A2)|, \qquad \text{(Eq. 14)}$$

and

$$K_{loc}(A2,A1) = M1, \qquad \text{(Eq. 15)}$$

where M1 is a large value, which can be set to

$$M1 = 10 \cdot \max_{i,j} |Ai,Aj|. \qquad \text{(Eq. 16)}$$

M1 is thus defined as ten times the maximum possible distance between any two text regions in a considered part
of the page. Generally, this heavily favors A1 over A2 in the calculation of K_loc.

In the second case, if row preference is selected, the value K_loc is defined as:

$$K_{loc}(A1,A2) = Q4 \cdot |A1,A2|, \text{ and} \qquad \text{(Eq. 17)}$$

$$K_{loc}(A2,A1) = M1, \qquad \text{(Eq. 18)}$$

where Q4 is tunable with a default value of 1. Again, A1 is generally heavily favored over A2.

In a third case, the boundaries of the text regions do not overlap in the Y axis but overlap in the X axis and A1 is
located above A2; that is, B1 < T2 and min(R1,R2) > max(L1,L2).

In this case, for both column and row preferences, the value K_loc is defined as:

$$K_{loc}(A1,A2) = Q5 \cdot |A1,A2|, \text{ and} \qquad \text{(Eq. 19)}$$

$$K_{loc}(A2,A1) = M1, \qquad \text{(Eq. 20)}$$

where Q5 is a tunable parameter with a default value 1. These calculations generally heavily favor A1 over A2.
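The three cases for K_loc can be gathered into one function, sketched below in Python. Several points are assumptions made for illustration: the names `Region` and `k_loc_pair` are hypothetical; the Y axis is taken to grow downward; the first-case test (R1 < L2 and T1 > B2) is inferred from "to the left of and below" because the condition is not legible in the source; and the caller is expected to pass the pair in the orientation of one of the cases, trying the swapped pair if `None` is returned. The value `m1` is the large constant M1 of Eq. 16 and is supplied by the caller.

```python
from collections import namedtuple

Region = namedtuple("Region", "top bottom left right")

# Default values of the tunable parameters as quoted in the description.
Q1, Q2, Q3, Q4, Q5 = 1, 2, 4, 1, 1


def cntr_x(a):
    return (a.left + a.right) / 2


def cntr_y(a):
    return (a.top + a.bottom) / 2


def distance(a1, a2):
    # |A1,A2| as defined in Eq. 3.
    return abs(cntr_x(a1) - cntr_x(a2)) + abs(cntr_y(a1) - cntr_y(a2))


def k_loc_pair(a1, a2, m1, column_preference=True):
    """Return (K_loc(A1,A2), K_loc(A2,A1)), or None if no case applies."""
    dx = abs(cntr_x(a1) - cntr_x(a2))
    d = distance(a1, a2)
    # First case (assumed test): A1 is to the left of and below A2.
    if a1.right < a2.left and a1.top > a2.bottom:
        return (Q1 * dx, Q2 * d) if column_preference else (Q3 * d, Q4 * d)
    # Second case: A1 is to the left of A2 and T1 < B2.
    if a1.right < a2.left and a1.top < a2.bottom:
        return (Q1 * dx, m1) if column_preference else (Q4 * d, m1)
    # Third case: A1 is above A2 and the regions overlap in the X axis.
    if a1.bottom < a2.top and min(a1.right, a2.right) > max(a1.left, a2.left):
        return (Q5 * d, m1)
    return None
```

For a lower-left region and an upper-right region whose centers are 20 apart in X and 50 apart in Y, column preference gives the pair (20, 140), while row preference gives (280, 70).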
The function K_dif(A1,A2) is defined as

$$K_{dif}(A1,A2) = Q6 \cdot (m_1 + m_2) \cdot (|s_1 - s_2| + |l_1 - l_2|), \qquad \text{(Eq. 21)}$$

where m_i is the number of text lines in region Ai, s_i is the text point size for region Ai, l_i is the distance between
consecutive text lines in region Ai, Q6 is a tunable parameter (default value is 10), and s_i and l_i represent the height
(in the Y direction) of a line. K_dif(A2,A1) is equal to K_dif(A1,A2).

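The function K_sep(A1,A2) is defined relative to a separator (non-text region) B and is calculated in terms of a
horizontal or vertical extrusion parameter. For a text region A and a separator
B, the horizontal extrusion parameter E_hor(A,B) is defined as

$$E_{hor}(A,B) = \begin{cases}
\dfrac{\max[\mathrm{Lft}(A) - \mathrm{Lft}(B),\ \mathrm{Rgt}(B) - \mathrm{Rgt}(A)]}{\mathrm{Rgt}(A) - \mathrm{Lft}(A)}, & \text{if } [\mathrm{Lft}(B) < \mathrm{Lft}(A) \text{ and } \mathrm{Rgt}(B) > \mathrm{Lft}(A)] \\
 & \text{or } [\mathrm{Rgt}(B) > \mathrm{Rgt}(A) \text{ and } \mathrm{Lft}(B) < \mathrm{Rgt}(A)]; \\
0, & \text{otherwise.}
\end{cases} \qquad \text{(Eq. 22)}$$

E_hor(A,B) is greater than zero if either of the left or right edges of the text region A falls within the range defined
by the left and right edges of the separator B.

Similarly, a vertical extrusion parameter E_vert(A,B) is defined as

$$E_{vert}(A,B) = \begin{cases}
\dfrac{\max[\mathrm{Top}(A) - \mathrm{Top}(B),\ \mathrm{Bot}(B) - \mathrm{Bot}(A)]}{\mathrm{Bot}(A) - \mathrm{Top}(A)}, & \text{if } [\mathrm{Top}(B) < \mathrm{Top}(A) \text{ and } \mathrm{Bot}(B) > \mathrm{Top}(A)] \\
 & \text{or } [\mathrm{Bot}(B) > \mathrm{Bot}(A) \text{ and } \mathrm{Top}(B) < \mathrm{Bot}(A)]; \\
0, & \text{otherwise.}
\end{cases} \qquad \text{(Eq. 23)}$$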
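Both extrusion parameters follow the same pattern along a single axis, so they can share one helper. The Python sketch below is illustrative: the names `Region`, `_extrusion`, `e_hor`, and `e_vert` are assumptions, and the horizontal form is taken to mirror the vertical one (including division by the width of the text region), since the horizontal equation is only partly legible in the source.

```python
from collections import namedtuple

Region = namedtuple("Region", "top bottom left right")


def _extrusion(a_lo, a_hi, b_lo, b_hi):
    """Extrusion of separator extent [b_lo, b_hi] past text extent [a_lo, a_hi]."""
    # The separator covers the low edge or the high edge of the text region.
    covers_low_edge = b_lo < a_lo and b_hi > a_lo
    covers_high_edge = b_hi > a_hi and b_lo < a_hi
    if covers_low_edge or covers_high_edge:
        # Largest overhang, normalized by the size of the text region.
        return max(a_lo - b_lo, b_hi - a_hi) / (a_hi - a_lo)
    return 0.0


def e_hor(a, b):
    # Horizontal extrusion: compare left/right extents.
    return _extrusion(a.left, a.right, b.left, b.right)


def e_vert(a, b):
    # Vertical extrusion: compare top/bottom extents.
    return _extrusion(a.top, a.bottom, b.top, b.bottom)
```

A wide horizontal rule that extends 10 units past the left edge and 20 units past the right edge of a 20-unit-wide text region gives a horizontal extrusion of 20 / 20 = 1.0.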

The function K_sep(A1,A2) is defined as follows for the following two possible cases.

In a first case, the text regions A1, A2 are vertically disjoint; that is, the regions A1, A2 do not overlap in the Y axis,
as defined by min(Bot(A1),Bot(A2)) < max(Top(A1),Top(A2)). In this case,

$$K_{sep}(A1,A2) = Q7 \cdot \sum_{B} \big(E_{hor}(A1,B) \cdot E_{hor}(A2,B)\big), \qquad \text{(Eq. 24)}$$

where Q7 is a tunable parameter (default can be 10) and the sum over B includes all separators between A1 and A2; i.e.,

Top(B)>min(Bot(A1),Bot(A2)), and
Bot(B)<max(Top(A1),Top(A2)).

In a second case, the text regions are horizontally disjoint; that is, the regions A1, A2 do not overlap on the X axis,
as defined by min(Rgt(A1),Rgt(A2)) < max(Lft(A1),Lft(A2)). In this case,

$$K_{sep}(A1,A2) = Q7 \cdot \sum_{B} \big(E_{vert}(A1,B) \cdot E_{vert}(A2,B)\big), \qquad \text{(Eq. 25)}$$

where the sum over B includes all separators between A1 and A2; i.e.,

Lft(B)>min(Rgt(A1),Rgt(A2)), and
Rgt(B)<max(Lft(A1),Lft(A2)).

Once K_loc, K_dif, and K_sep are calculated for all combinations of A1, A2, ..., AK, the precedence functions f(Ai,Aj), i≠j,
i=1-K, j=1-K, can be constructed and used in finding the lengths of different permutations of paths P(π) to identify the
shortest Hamiltonian path P(π).

Referring to Fig. 10, the text capturing and ordering program may be implemented in digital electronic circuitry or
in computer hardware, firmware, software, or in combinations of them, such as in a computer system. The computer
system includes a central processing unit (CPU) 502 connected to an internal system bus 504. The storage media in 
the computer system include a main memory 506 (which can be implemented with dynamic random access memory 
devices), a hard disk drive 508 for mass storage, and a non-volatile memory (NVRAM) 510. The main memory 506 
and NVRAM 510 are connected to the bus 504, and the hard disk drive 508 is coupled to the bus 504 through a hard
disk drive controller 512.

Apparatus of the invention may be implemented in a computer program product tangibly embodied in a machine- 
readable storage device (such as the hard disk drive 508, main memory 506, or NVRAM 510) for execution by the 
CPU 502. Suitable processors include, by way of example, both general and special purpose microprocessors.
Generally, a processor will receive instructions and data from the read-only memory 510 and/or the main memory 506.
Storage devices suitable for tangibly embodying computer programming instructions include all forms of non-volatile
memory, including by way of example semiconductor memory devices, such as EPROM, EEPROM, and flash memory 
devices; magnetic disks such as the internal hard disk drive 508 and removable disks and diskettes 528 connected 
through a controller 526; magneto-optical disks; and CD-ROM disks. Any of the foregoing may be supplemented by, 
or incorporated in, specially-designed ASICs (application-specific integrated circuits). 

The computer system further includes an input-output (I/O) controller 514 connected to the bus 504 and which
provides a keyboard interface 516 for connection to an external keyboard, a mouse interface 518 for connection to an
external mouse or other pointer device, and a parallel port interface 520 for connection to a printer. In addition, the bus
504 is connected to a video controller 522 which couples to an external computer monitor or display 524. Data
associated with an image for display on a computer monitor 524 are provided over the system bus 504 by application
programs to the video controller 522 through the operating system and the appropriate device driver.

Other embodiments are within the scope of the following claims. For example, the order of the steps of the invention 
may be changed by those skilled in the art and still achieve desirable results. Different techniques can be used to 
identify an optimal path between vertices of a graph representing text blocks or regions in an image. Although specific 
equations and parameters have been disclosed to determine variables used in finding an optimal order of text blocks
or regions, such equations and parameters can be changed.



Claims 

1. A computer-implemented method of ordering text in an image stored in a computer, the text being grouped in multiple
blocks, the method comprising: 

grouping the text in multiple regions; 

representing the text regions as a graph having vertices and edges; 

calculating an optimal Hamiltonian path through the vertices; and 

ordering the text regions according to the calculated optimal Hamiltonian path. 

2. The method of claim 1, wherein the representing step includes:

defining each text region as a vertex in a graph; 
defining edges between the vertices; and 
assigning weights to the edges. 



3. The method of claim 2, wherein oriented pairs of edges are defined between any two vertices.

4. The method of claim 2, further comprising:
defining a graph including the vertices and edges; and

solving for a traveling salesman problem to calculate the optimal Hamiltonian path in the graph.

5. The method of claim 2, wherein the weights assigned the edges between the vertices are based on the distance
between any two text regions.

6. The method of claim 2, wherein the weights assigned the edges between the vertices are based on text
characteristics of the text regions.

7. The method of claim 6, wherein the text characteristics include font size and number of lines of text.

8. The method of claim 2, wherein the weights assigned the edges between the vertices are based on the existence
of non-text separators between text region pairs.

9. The method of claim 8, wherein the separators include graphical objects. 

10. The method of claim 1, further comprising: 

identifying text blocks that can be combined; and 
combining the text blocks into a text region. 

11. The method of claim 10, wherein two text blocks can be combined if they are vertically connected. 

12. The method of claim 10, wherein two text blocks can be combined if they are horizontally connected. 

13. The method of claim 1, further comprising:

separating the image into independent parts, each part containing its own set of text regions; and
performing the grouping, representing, calculating, and ordering on the set of text regions on each part
independently.

14. The method of claim 13, wherein the image is separated by identifying predetermined types of non-text separators.

15. The method of claim 13, further comprising: 
concatenating orders of text regions identified for the different parts. 

16. A program residing on a computer-readable medium for ordering text in an image stored in a computer, the program
comprising instructions for causing the computer to:

group the text in multiple regions; 

represent the text regions as a graph having vertices and edges; 
calculate an optimal Hamiltonian path through the vertices; and 
order the text regions according to the calculated optimal Hamiltonian path. 

17. The program of claim 16, wherein the representing includes: 

defining each text region as a vertex in a graph; 
defining edges between the vertices; and 
assigning weights to the edges. 

18. The program of claim 17, wherein weights assigned the edges between the vertices are based on the distance
between corresponding text blocks and the characteristics of each block.

19. The program of claim 18, wherein the weights assigned the edges between the vertices are further based on the
existence of separators between the text block pairs.

20. The program of claim 16, wherein the program comprises instructions for further causing the computer to:

identify blocks of text that can be combined; and 
combine the text blocks into a text region. 

21. The program of claim 16, wherein the program comprises instructions for further causing the computer to: 

separate the image into independent parts; and 

perform the grouping, representing, calculating, and ordering on each part. 

22. The program of claim 21, wherein the program comprises instructions for further causing the computer to
concatenate orders of text regions identified for the different parts.

23. Apparatus for recognizing text in an image, comprising: 

a storage medium to store the image; and 

a processor operatively coupled to the storage medium and configured to: 
group the text in multiple regions; 

represent the text regions in a graph having vertices and edges; 
calculate an optimal Hamiltonian path through the vertices; and 
order the text regions according to the calculated optimal Hamiltonian path. 

24. The apparatus of claim 23, wherein the representing includes: 

defining each text block as a vertex in a graph; 
defining edges between the vertices; and 
assigning weights to the edges. 

25. The apparatus of claim 24, wherein the weights assigned the edges between the vertices are based on the distance 
between any two text regions. 

26. The apparatus of claim 25, wherein the weights assigned the edges between the vertices are further based on the 
existence of separators between the text block pairs. 

27. A method implemented in a computer for ordering text in an image stored in the computer, the method comprising:

identifying a set of text blocks; 

separating the set of text blocks into independent subsets of text blocks; 
representing the text blocks as vertices in a graph in each subset; 
defining directed edges between vertices in each subset; 
finding an optimal Hamiltonian path through the graph in each subset; 

determining the order of the text blocks in each subset based on the optimal Hamiltonian path; and
combining the orders of text blocks in the subsets into a final order.